ABSTRACT

Measurements of the equation of state of dark energy from surveys of thousands of Type Ia
Supernovae (SNe Ia) will be limited by spectroscopic follow-up and must therefore rely on
photometric identification, increasing the chance that the sample is contaminated by Core Collapse
Supernovae (CC SNe). Bayesian methods for supernova cosmology can remove contamination
bias while maintaining high statistical precision but are sensitive to the choice of
parameterization of the contaminating distance distribution. We use simulations to investigate the
form of the contaminating distribution and its dependence on the absolute magnitudes, light curve
shapes, colors, extinction, and redshifts of core collapse supernovae. We find that the CC luminosity
function dominates the distance distribution function, but its shape is increasingly distorted as
the redshift increases and more CC SNe fall below the survey magnitude limit. The shapes and
colors of the CC light curves generally shift the distance distribution, and their effects on the
CC distances are correlated. We compare the simulated distances to the first year results of the
SDSS-II SN survey and find that the SDSS distance distributions can be reproduced with
simulated CC SNe that are ~1 mag fainter than the standard Richardson et al. (2002) luminosity
functions, which do not produce a good fit. To exploit the full power of the Bayesian parameter
estimation method, parameterization of the contaminating distribution should be guided by the
current knowledge of the CC luminosity functions, coupled with the effects of the survey selection
and magnitude limit, and allow for systematic shifts caused by the parameters of the distance
fit.

Subject headings: cosmological parameters — dark energy — supernovae: general 



1. INTRODUCTION



Type Ia Supernovae (SNe Ia) can potentially enable the most precise measurement of the equation
of state of dark energy at low to intermediate redshifts. Future ground-based surveys will collect
thousands (Pan-STARRS (Kaiser et al.), DES (Bernstein et al. 2009)) to millions (LSST (Tyson 2002))
of supernova light curves in this redshift range; however, obtaining spectra of such a large number of
candidates will be prohibitively expensive. Thus photometric methods are likely to be used to identify
all but the few percent of transients that are tagged for spectroscopic follow-up. Many photometric
identification methods for supernovae have been proposed (Dahlen & Goobar; Gal-Yam et al. 2004;
Johnson & Crotts 2006; Sullivan et al. 2007; Connolly & Connolly; Rodney & Tonry 2009;
Scolnic et al.; Gong et al. 2010), but none purport to be as robust as spectroscopic identification.
Photometrically-identified supernova samples will suffer from both reduced purity (the fraction of
objects in the sample that are actually SNe Ia) and reduced completeness (the fraction of SNe Ia that
survive the various cuts and make it into the final sample). A sample that is contaminated by Core
Collapse (CC) SNe or other objects - possibly by as little as 2-5% of SNe Ib/c (Homeier 2005) - will
introduce biases in the recovered cosmological parameters, and removing true SNe Ia from the sample
will reduce the statistical precision of the measurement.

A high precision measurement of dark energy from a large sample of photometrically-identified
supernovae will not be possible unless these issues are addressed. To that effect, a SN photometric
classification challenge has been issued in an attempt to test the relative merits of various
classification and photo-z estimation methods (Kessler et al. 2010). Additionally, the cosmological
measurements of photometrically-identified SN samples can be improved by the Bayesian parameter
estimation method developed by Kunz et al. (2007). This method uses all of the candidate objects,
allowing for the possibility that each object belongs to one of two (or more) species, each of which has a
different probability distribution describing the apparent distances they are likely to have (at a given
redshift) when fit as a SN Ia. The individual probability that each object is a SN Ia (or another species)
is used to decide whether the object should be treated as a Ia, drawn from a narrow distribution, or as a
contaminant, drawn from some other distribution. This is an alternative to imposing cuts on the sample,
which will reduce the statistical precision of the measurement and still may not leave the sample 100%
pure, resulting in bias.

The Bayesian method requires some assumption about the probability distribution of the distances of
both the Ia and the contaminating CC SNe (which, we should stress, are not real distances). For SNe Ia, this
is the usual likelihood function, where the deviation from the true distance is described by a Gaussian function
with a dispersion given by the intrinsic variation of SN Ia luminosities (~0.15 mag) and the measurement
error added in quadrature (a dispersion due to the SN redshift uncertainty and peculiar velocity may also be
added (Kessler et al. 2009a)). The contaminating distribution is not as straightforward; it
will likely include objects of different CC supernova types, each with different luminosity functions (though
the method allows for multiple probability distributions), and these "distances" are derived by fitting the
non-Ia light curves with a set of SN Ia light curve parameters. This is inherent in the problem, since we do
not know ahead of time to which class each object belongs.

Thus the contaminating distribution at a given redshift will be a complex function of the contaminants'
absolute magnitudes, intrinsic colors, extragalactic extinction, light curve shapes, and how these alter the
predicted distance when described by a SN Ia model. Additionally, this distribution may not be the same
at every redshift. Though the contaminating distribution does not need to be known exactly - it may
be parameterized with any number of parameters which are marginalized over at the end - the choice of
parameterization should follow the functional form of the contaminating distances derived for the non-Ia
objects.

This paper attempts to characterize the distribution of contaminating distances. To do this, we run a 
series of simulations to determine the contaminating distribution as a function of redshift, contaminant
supernova type, and the contaminants' absolute magnitudes, extinction, and light curve shapes. By comparing
input parameters to those determined by the distance fitter for a set of realistic simulated light curves, we 
may determine how each component of the output distance contributes to the final distance function, and 
how this function changes as the simulated data get noisier at higher redshifts. The hope is that this study 
will guide future supernova cosmology endeavors as they attempt to extract the dark energy equation of 
state out of the thousands to millions of observed supernovae. 
2. METHOD

In this section we present a brief overview of the Bayesian supernova cosmology framework, motivate 
our approach to investigating the core collapse distance modulus residual distribution, and describe the 
simulations used to carry out this study. 



2.1. Bayesian Estimation Applied to Multiple Species 



The presence of non-Ia objects in the sample means that the data are drawn from multiple probability
distributions instead of one distribution describing the SN Ia distances (at a given redshift). Kunz et al.
(2007) have developed a method to deal with just this issue, which they call BEAMS: Bayesian Estimation
Applied to Multiple Species. In the BEAMS framework the data are fit using a posterior which weights the
likelihood by the probability that each object is a SN Ia (see Kunz et al. (2007) for a full derivation):

$P(\theta|\mu_i) \propto P_i \mathcal{L}_{i,\rm Ia} + (1 - P_i) \mathcal{L}_{i,\rm non\text{-}Ia}$,    (1)

with

$\mathcal{L}_{i,\rm Ia} = \frac{1}{\sqrt{2\pi}\,\sigma_i} e^{-(\mu_i - \mu_{\rm th}(\theta))^2 / 2\sigma_i^2}$,    (2)

where supernova $i$ has a probability $P_i$ of being Type Ia, a measured distance modulus $\mu_i$, an error
$\sigma_i$ which includes the measurement error and the intrinsic dispersion of Ia magnitudes added in
quadrature, and a redshift $z_i$ which is used to calculate the theoretical distance modulus
$\mu_{\rm th}(\theta)$ for the cosmological parameters given by $\theta$. $\mathcal{L}_{i,\rm Ia}$ is the usual
likelihood function for SN Ia distances, weighted by $P_i$, and $\mathcal{L}_{i,\rm non\text{-}Ia}$ is the
non-Ia likelihood that describes the contaminating distribution, weighted by $(1 - P_i)$. The parameters
describing the contaminating distribution are added to the variables to be fit and marginalized over at the
end. The probability $P_i$ that candidate $i$ is a Type Ia SN can be calculated by comparing the light curves
to templates, or by some other method, and then fed directly into the cosmological parameter estimation
instead of being used to make cuts in the sample. Alternatively, a global probability parameter may be
introduced and marginalized over in the case that the individual probabilities are unknown or uncertain
(Kunz et al. 2007; Gong et al. 2010).
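To make the structure of Equations (1) and (2) concrete, the following is a minimal Python sketch of the log-posterior summed over objects. It is an illustration only, not the implementation of Kunz et al. (2007): the function names are hypothetical, and the contaminant likelihood is assumed here to be a single Gaussian offset from the theoretical distance modulus by a nuisance parameter `shift` with extra width `width`, which is exactly the kind of parameterization choice this paper sets out to examine.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Normalized Gaussian density, as in Eq. (2)."""
    norm = math.sqrt(2.0 * math.pi) * sigma
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / norm


def beams_log_posterior(mu_obs, sigma, p_ia, mu_theory, shift, width):
    """Sum over objects of the log of Eq. (1), up to an additive constant.

    mu_obs, sigma, p_ia, mu_theory: per-object sequences of measured distance
    modulus, its error, the Ia probability, and the theoretical modulus.
    shift, width: ASSUMED nuisance parameters of a Gaussian contaminant
    distribution centred on mu_theory + shift (not taken from the paper).
    """
    total = 0.0
    for mu_i, s_i, p_i, mu_th in zip(mu_obs, sigma, p_ia, mu_theory):
        like_ia = gaussian_pdf(mu_i, mu_th, s_i)
        # Assumption: contaminant width adds in quadrature to the error.
        like_non = gaussian_pdf(mu_i, mu_th + shift, math.hypot(s_i, width))
        total += math.log(p_i * like_ia + (1.0 - p_i) * like_non)
    return total
```

In a full analysis `shift` and `width` would be sampled together with the cosmological parameters and marginalized over at the end; a single Gaussian is used here only because it is the simplest possible contaminant model.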



2.2. The Contaminating Distance Function 

The "distance" to a contaminant object is determined by fitting a set of multi-band, multi-epoch apparent
magnitude light curves to SN Ia templates described by the parameters ($t_{\rm max}$, $A_V$, $\Delta$, $\mu$),
where $t_{\rm max}$ is the epoch of maximum light, $A_V$ is the extinction, $\Delta$ is the MLCS2k2 parameter
describing the luminosity-width relation for SNe Ia, and $\mu$ is the distance modulus. The peak apparent
magnitude of a Type Ia supernova template, for observer-frame filter $Y$, is given by

$m_{Y,\rm Ia} = M_{V,\rm Ia} + \mu_{\rm Ia} + P_V \Delta + Q_V \Delta^2 + A_{V,\rm Ia} + K_{VY,\rm Ia}$,    (3)

while that of an observed CC supernova is given by

$m_{Y,\rm CC} = M_{V,\rm CC} + \mu_{\rm CC} + A_{V,\rm CC} + K_{VY,\rm CC}$.    (4)

The constants $M_{V,\rm Ia} = -19.5 + 5\log(H_0/65)$, $P_V = 0.736$, and $Q_V = 0.182$ have been determined from
a training set of SN Ia light curves (Jha et al. 2007); the cross-filter K-corrections (Kim et al. 1996) are
calculated from a set of spectral templates during the fit; and the SN redshift is an input to the fitter. (Note
that throughout this paper we assume we have accurate knowledge of the redshifts, either from spectroscopy
of the SN itself or its host galaxy.) Rearranging the above two equations and subtracting one from the other,
we get the distance modulus residual for a contaminating supernova,

$\mu_{\rm Ia} - \mu_{\rm CC} = m_{\rm Ia} - m_{\rm CC} - (M_{\rm Ia} - M_{\rm CC}) - (A_{\rm Ia} - A_{\rm CC}) - f(\Delta) - (K_{\rm Ia} - K_{\rm CC})$    (5)

(dropping the filter designations), where $f(\Delta) = 0.736\Delta + 0.182\Delta^2$ and we set $\Delta = 0$ for
CC SNe. This distance residual is a function of the true values for each supernova, designated by CC; the
best-fit parameters from the distance fitter, $\mu_{\rm Ia}$, $A_{\rm Ia}$, $\Delta$ (and $t_{\rm max}$); and
other quantities, such as the K-correction, calculated internally by the distance fitter. Thus the distance
residual for a given supernova has several components:

$\delta\mu = \delta m - \delta M - \delta A - \delta\Delta - \delta K$,    (6)

where $\delta$ denotes the output or calculated value minus the true value. For a set of contaminating CC SNe, the
distribution of their distance residuals is exactly the contaminating distribution which must be parameterized
in the Bayesian supernova cosmology formalism.
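As a worked illustration of the residual decomposition, the short Python sketch below evaluates the distance residual from its components, using the MLCS2k2 coefficients quoted above. The function names are hypothetical; the only modelling input is that the true luminosity parameter of a CC SN is defined to be zero, so that the luminosity-width residual reduces to the quadratic term evaluated at the fitted value.

```python
def f_delta(delta):
    """Luminosity-width correction, 0.736*Delta + 0.182*Delta^2 (mag)."""
    return 0.736 * delta + 0.182 * delta ** 2


def distance_residual(dm, d_abs_mag, d_ext, delta_fit, dk):
    """Distance residual (mag) from its five components.

    dm, d_abs_mag, d_ext, dk: residuals (output minus true) in apparent
    magnitude, absolute magnitude, extinction and K-correction.
    delta_fit: fitted luminosity parameter; the true value is 0 for CC SNe,
    so its residual contribution is f_delta(delta_fit).
    """
    return dm - d_abs_mag - d_ext - f_delta(delta_fit) - dk


# Example: a broad light curve fit with Delta = -0.5 and all other
# residuals zero is placed about 0.32 mag farther away.
example = distance_residual(0.0, 0.0, 0.0, -0.5, 0.0)
```

With these coefficients, a fitted value of -0.5 contributes 0.736(-0.5) + 0.182(0.25) = -0.3225 mag to the correction, and hence +0.3225 mag to the distance residual.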

For a well-observed (good signal-to-noise) set of SNe Ia, the distribution of distance residuals is a function
of the intrinsic scatter of the Ia luminosities (after accounting for the luminosity-width relation) and the
residuals of the other fit parameters, which would be distributed normally about their true values. When CC
light curves are fit with Ia templates, however, there is little reason to suspect that the residuals of the fit
parameters are normally distributed. For example, the $\delta\Delta$ distribution will likely depend on the
distribution of the shapes of the different CC SNe in the sample: $\Delta$ describes the SN Ia width-luminosity
relation, whereby brighter ($\Delta < 0$) SNe have broader light curves and fainter ($\Delta > 0$) SNe decline
faster (Phillips 1993). Additionally, the $\delta A$ distribution will likely depend on the distribution of CC
colors, since the color excess is measured with respect to the Ia template color, and it is unlikely that the CC
colors are normally distributed about the Ia colors.

We can examine how the distribution of each residual individually affects the distribution of distance
residuals by allowing one component to vary while fixing the other residuals to zero. For $\delta M$ this is done by
setting the CC absolute magnitudes to $M_{\rm Ia}$ in the simulations, for $\delta A$ this is done by fixing the
parameter to the true value in the distance fit, and for $\delta\Delta$ we set $\Delta = 0$ (for CC SNe) or the true
value (for SNe Ia) in the distance fit. We do not attempt to fix the $\delta m - \delta K$ component (which we call the
zero-point residual), which is caused by the error in the time of peak brightness $t_{\rm max}$, the erroneous
calculation of the K-corrections, and general random noise in the apparent magnitude light curves. Correlations
may be investigated as we allow multiple components to vary, and we also look at how the residual distributions
vary with redshift and CC type.



2.3. The Simulations 



This section describes the parameters and methods used to produce the simulated observations for this
study. During the course of this work the supernova analysis package SNANA^1 (Kessler et al. 2009b) was
released which, in addition to simulating supernova surveys, also contains light curve fitting and cosmological
analysis software. While SNANA is well-suited to forecasting future surveys and for general analyses of large
SN samples, our simulations are tailored to study the specific effects of fitting CC SN light curves to Ia
templates as would happen in a contaminated sample of supernovae.

We simulate SN observations using the SN Ia Branch-normal, SN Ib/c, SN IIL, and SN IIP spectral and
light curve templates of P. Nugent^2. The template SEDs are interpolated between epochs where necessary
and are integrated to create rest-frame UBVRI light curves when they do not otherwise exist. In addition,
we create Ib/c, IIL, and IIP templates based on SNe observed by the Carnegie Supernova Project^3 (CSP). The
Ib/c, IIL, and IIP CSP templates are based on SN2004fe, SN2004ex, and SN2004er respectively; these were
chosen because they had well-observed, good signal-to-noise light curves with sufficient pre- and post-peak
coverage. Figure 1 shows the rest-frame SN Ia (for two values of $\Delta$), SN Ib/c, SN IIL, and SN IIP (both
Nugent and CSP) template light curves.

^1 http://www.sdss.org/supernova/SNANA.html
^2 http://supernova.lbl.gov/nugent/nugent_templates.html
^3 http://csp1.lco.cl/~cspuser1/PUB/CSP.html



[Figure 1 panel title: SN Ia Template Light Curves]

Fig. 1. - Template UBVRI light curves for both Ia and CC SN types. The SN Ia light curves are shown
for $\Delta = -0.3$ (blue) and $\Delta = 0.4$ (green), and the CC light curves show both the CSP (green) and Nugent
(blue) templates.

We give each SN Ia a random luminosity parameter $\Delta$, drawn from an empirical distribution of observed
local SNe Ia (Hicken et al. 2009); to avoid biases in the distribution due to faint (high $\Delta$) objects falling
below the magnitude limit of the survey, we define the distribution using only objects with $z < 0.04$. Additionally,
we add an intrinsic luminosity variation with a dispersion of 0.12 mag and a color variation with a dispersion
of 0.05 mag (Nobili et al. 2003). These dispersions were chosen such that a simulated sample of well-observed
(high signal-to-noise) SN Ia light curves produces distances with a dispersion of about 0.15 mag, matching
what is found in the literature. The CC SN absolute magnitudes, unless they are set to $M_{\rm Ia}$, are drawn
from Gaussian distributions with means and dispersions taken from Richardson et al. (2002) (see Table 1);
additionally, depending on the specific simulation, the CC SNe may have an added color variation with a
dispersion of 0.15 mag (Sullivan et al. 2006).

Host galaxy extinction is added using the extinction law of Cardelli et al. (1989) (updated by O'Donnell
(1994)). The value of $A_V$ for each SN is drawn from an exponential distribution (unless it is set to a specific
value) with an average of 0.35 for CC SNe and 0.30 for SNe Ia (Hatano et al. 1998). The total-to-selective
extinction ratios used are $R_V = 3.1$ for CC SNe and $R_V = 2.3$ for SNe Ia (Smith 2009). We find that a 35%
change in $R_V$ only shifts the output distances by as much as 0.2%, thus our conclusions are insensitive to the
choice of $R_V$.

SN Type    Fraction    $M_B$ (mag)    $\sigma_M$ (mag)
Ib/c       0.24        -17.63         1.4
IIL        0.31        -17.63         0.9
IIP        0.45        -16.63         1.1

Table 1: Core Collapse supernova types, intrinsic fractions, and luminosity characteristics. Fractions are
modified from Dahlen & Fransson (1999), which includes sub-populations and Type IIn SNe that we ignore
here; $M_B$ is the peak absolute B-band magnitude from Richardson et al. (2002), rescaled to our $H_0 = 71$
km s$^{-1}$ Mpc$^{-1}$; and $\sigma_M$ is the dispersion in peak absolute magnitude (Richardson et al. 2002).
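The population draws described above can be sketched in a few lines of Python. This is a hedged illustration rather than the simulation code used for this paper: the function and variable names are hypothetical, the Table 1 values are copied as given, and the SN type is drawn from the intrinsic fractions purely for demonstration (the simulations in this work instead generate a fixed number of SNe per type and redshift).

```python
import random

# Table 1: intrinsic fraction, mean peak M_B (mag), dispersion (mag).
CC_TYPES = {
    "Ib/c": (0.24, -17.63, 1.4),
    "IIL": (0.31, -17.63, 0.9),
    "IIP": (0.45, -16.63, 1.1),
}


def draw_cc_supernova(rng, mean_av=0.35):
    """Draw (type, absolute magnitude, host A_V) for one simulated CC SN.

    rng: a random.Random instance, passed in for reproducibility.
    mean_av: mean of the exponential A_V distribution (0.35 for CC SNe).
    """
    names = list(CC_TYPES)
    weights = [CC_TYPES[name][0] for name in names]
    # Assumption for illustration: type sampled from intrinsic fractions.
    sn_type = rng.choices(names, weights=weights, k=1)[0]
    _, mean_mag, sigma_mag = CC_TYPES[sn_type]
    abs_mag = rng.gauss(mean_mag, sigma_mag)   # Gaussian luminosity function
    a_v = rng.expovariate(1.0 / mean_av)       # exponential extinction
    return sn_type, abs_mag, a_v
```

Calling `draw_cc_supernova(random.Random(1))` repeatedly yields a mock population whose type fractions and mean extinction approach the tabulated values.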

Finally, we redshift the magnitudes to the observed frame by adding the distance modulus, calculated
at each redshift for a fiducial set of cosmological parameters (where $H_0 = 71$ km s$^{-1}$ Mpc$^{-1}$,
$\Omega_M = 0.26$, $\Omega_{DE} = 0.74$, and $w = -1$), and the cross-filter K-corrections, calculated as in
Kim et al. (1996) as a function of redshift and epoch; the rest frame epochs are also dilated by $(1 + z)$. We
choose to simulate SNe in the SDSS-II Supernova survey, instead of a larger planned survey such as Pan-STARRS
or LSST, so that we may check our simulations against existing data (discussed in a later section). Supernovae
are simulated at 8 discrete redshifts: 0.075, 0.125, 0.175, 0.225, 0.275, 0.325, 0.375, and 0.425. Simulated
observations are in the form of SDSS g, r, and i light curves (Fukugita et al. 1996), with noise added in flux
space to the calculated apparent magnitude light curves. The signal-to-noise ratio for each observation is
calculated using SDSS-II telescope and instrument parameters, and the calculations are calibrated so that they
give the stated $5\sigma$ magnitude limit in each filter (Gunn et al. 1998). The SDSS-II cadence is also matched,
and some fraction of epochs is removed to simulate poor weather, resulting in an average of one observation every
5 days, in each filter (Frieman et al. 2008).
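The idea of adding noise in flux space, calibrated to a stated 5-sigma magnitude limit, can be illustrated with the simplified Python sketch below. It is not the signal-to-noise calculation used in this work, which relies on SDSS-II telescope and instrument parameters; here the noise is assumed to be background-dominated and therefore constant in flux, and the zero point and function names are hypothetical choices made only for the example.

```python
import math
import random


def flux_from_mag(mag, zero_point=27.5):
    """Convert a magnitude to flux in arbitrary units (zero point assumed)."""
    return 10.0 ** (-0.4 * (mag - zero_point))


def signal_to_noise(mag, mag_limit_5sigma):
    """S/N under the ASSUMPTION of constant, background-dominated noise.

    By construction an object at the limiting magnitude has S/N = 5.
    """
    return 5.0 * 10.0 ** (-0.4 * (mag - mag_limit_5sigma))


def observe(mag, mag_limit_5sigma, rng, zero_point=27.5):
    """Return (noisy flux, flux error) for one simulated observation."""
    flux = flux_from_mag(mag, zero_point)
    sigma = flux_from_mag(mag_limit_5sigma, zero_point) / 5.0
    return rng.gauss(flux, sigma), sigma
```

Working in flux rather than magnitude keeps the noise symmetric and well defined even when a faint epoch scatters to negative flux, which is why the perturbation is applied before any conversion back to magnitudes.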

The simulated supernovae are fit with MLCS2k2 (Jha et al. 2007) to determine a distance. MLCS2k2
fits the multi-color light curves for the parameters $\mu$, $A_V$, $\Delta$, and $t_{\rm max}$. In all cases we fix
the value of $R_V$ to match what is done in the SDSS-II SN analysis. We use an exponential prior on $A_V$, with a
mean of 1/3, and a flat $\Delta$ prior, with $-0.4 < \Delta < 1.8$ and tapered $\sigma = 0.1$ Gaussian ends. We do
not include any redshift error, so the redshift we input to MLCS2k2 is the true redshift. When we want to set the
extinction or $\Delta$ residual to zero - that is, we are looking at the distance residual as a function of the
error in other parameters - we fix its value in the distance fit. No cuts are made based on the $\chi^2$ of the fit.
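For readers who want to see the shape of these priors, a minimal Python sketch follows. It is an assumption-laden illustration and not MLCS2k2 code: the function names are hypothetical, the luminosity-parameter prior is left unnormalized, and the tapered ends are modelled as half-Gaussians of width 0.1 joined continuously to the flat section, which is one plausible reading of the description above.

```python
import math


def av_prior(a_v, mean=1.0 / 3.0):
    """Exponential prior density on A_V with the stated mean of 1/3."""
    if a_v < 0.0:
        return 0.0
    return math.exp(-a_v / mean) / mean


def delta_prior(delta, low=-0.4, high=1.8, taper=0.1):
    """Unnormalized flat prior on Delta with Gaussian-tapered ends.

    ASSUMPTION: outside [low, high] the prior falls off as a Gaussian of
    width `taper` measured from the nearest edge.
    """
    if delta < low:
        return math.exp(-0.5 * ((delta - low) / taper) ** 2)
    if delta > high:
        return math.exp(-0.5 * ((delta - high) / taper) ** 2)
    return 1.0
```

Under this form a fit is only weakly discouraged from values slightly outside the flat range, but values several tenths beyond it are strongly suppressed.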



3. RESULTS 

We first look at the effect of each component of the contaminating distance function on the distance
residuals in Sections 3.1 to 3.4; we then investigate the correlation between $\Delta$ and $A_V$ in Section 3.5; and
finally, we simulate the full distance residual when all the components are non-zero in Section 3.6. For
most cases we simulate 100 SNe per type and redshift; since we also investigate the effect of the survey
magnitude limit on the contaminating distance distribution in Section 3.6, we increase this to 500 SNe
per type and redshift there. When $\delta M = 0$, we simulate the CC SNe with an absolute V-band magnitude of
$-19.5 + 5\log(H_0/65)$ (Jha et al. 2007), where $H_0 = 71$ km s$^{-1}$ Mpc$^{-1}$; when $\delta A = 0$, we fix
$A_V$ to its true value in the MLCS2k2 fit; and when $\delta\Delta = 0$, we fix $\Delta$ in the MLCS2k2 fit to be 0
for CC SNe and its true value for SNe Ia. Unless otherwise noted, we use only Nugent template light curve shapes,
and we do not give the CC SNe an intrinsic color variation. The main exception to this is the case where all
residuals are allowed to vary, in Section 3.6.
3.1. Zero-point Residuals: $\delta\mu = \delta m - \delta K$

The zero-point residuals are caused by the error in the best-fit time of peak brightness $t_{\rm max}$, the erroneous
calculation of the K-corrections, and general random noise in the apparent magnitude light curves. We fix
$\delta M = \delta A = \delta\Delta = 0$ as described above, and the distance residual $\delta\mu$ is the best-fitting
$\mu$ minus the true distance modulus, which is calculated from the redshift and cosmology. The distributions of
distance residuals are very narrow and centered on a different value for each CC type, and they get wider at higher
redshifts from increased noise. The IIL residual distributions are also wider than the other types at all redshifts,
as shown in Figure 2 for $z = 0.275$. Both the means and widths of the zero-point residual distributions
are a strong function of the distribution of output $t_{\rm max}$ values, which affect the measured peak apparent
magnitudes of the SNe. The output $t_{\rm max}$ of SNe IIP is often several days past the true $t_{\rm max}$, but
since their light curves are rather flat the difference in apparent magnitude is less; SNe IIL have the widest range
of output $t_{\rm max}$, giving them zero-point residuals that are fainter than the other CC SNe; and SNe Ib/c have
the narrowest light curves, so their output $t_{\rm max}$ are well constrained.



[Figure 2 panel title: Zero-point Residuals, z = 0.275]

Fig. 2. - The distance residuals caused by the zero-point (top left), $\Delta$ (top right), extinction (bottom left),
and absolute magnitude (bottom right) residual distributions at $z = 0.275$. SNe Ib/c are shown in blue, SNe
IIL in violet, and SNe IIP in red. Note that the y-axes and bin sizes are varied to make each component's
histogram clear, but the x-axes are held fixed to stress that the $\delta M$ distribution causes much more variation
in $\delta\mu$ than the other components.
3.2. Delta Residuals: $\delta\mu = \delta m - \delta K - \delta\Delta$

In these simulations we allow $\Delta$ to vary - that is, we do not fix $\Delta = 0$ in the MLCS2k2 distance fit - but
fix $\delta A = \delta M = 0$. Since $\Delta$ is a parameter describing the luminosity-width relation of the SN Ia light
curves, CC SNe have no such parameter and so the true $\Delta$ is defined to be 0. The distance residual $\delta\mu$ is
then given by the zero-point residual minus the $\Delta$ residual, $\delta\Delta = f(\Delta) = 0.736\Delta + 0.182\Delta^2$.
As in the previous section, the distributions of the residuals differ for the different CC SN types, and they get wider
as the redshift increases. They are in general not well-approximated by a Gaussian function (as seen in Figure 2) and
are correlated with the output $\Delta$: lower $\Delta$ means the SN is interpreted as being intrinsically brighter, and
the correction to this moves the output distance farther away (fainter) and so increases $\delta\mu$. It is also
interesting to note that the broad widths of Type IIP or "Plateau" supernovae cause them to have generally negative
output $\Delta$ values. A large fraction of these have $\Delta < -0.5$, which is outside the range used in the training of
MLCS2k2, and such objects are commonly excluded from a sample used to determine cosmology.



3.3. Extinction Residuals: $\delta\mu = \delta m - \delta K - \delta A$

In this simulation we do not fix $A_V$ in the MLCS2k2 distance fit, but fix $\delta\Delta = \delta M = 0$. The extinction
residual, $\delta A$, is the output $A_V$ minus the true $A_V$, which is a random value for each SN drawn from an
exponential distribution with an average of 0.35. The output extinction parameters are generally greater
than the input $A_V$ ($\delta A > 0$), so $\delta\mu < 0$ and the SNe are interpreted as being nearer than they really
are. This is because, in general, CC SNe are redder than SNe Ia at peak (Poznanski et al. 2002), and so when
SN Ia templates are used to fit CC light curves, this color difference is interpreted as higher extinction.
As in the case of the $\Delta$ residuals, the residual distribution functions are not well-approximated by Gaussian
functions, and there is generally evidence for a tail toward the fainter end.



3.4. Magnitude Residuals: $\delta\mu = \delta m - \delta K - \delta M$

In this simulation, $A_V$ and $\Delta$ are fixed in the distance fit, and the absolute magnitude residual $\delta M$
is the SN Ia magnitude, $M_{\rm Ia}$, minus the true absolute magnitude. We give the CC SNe random absolute
magnitudes according to the Gaussian luminosity functions in Richardson et al. (2002) (see Table 1). Most
SNe are thus much fainter than in previous simulations, where the absolute magnitude was set to the
characteristic SN Ia absolute magnitude, and most of the SNe at high redshift are too faint to pass the S/N
selection criteria (as we will see in Section 3.6). Figure 2 shows the distance residuals at a median $z$ of
0.275 for each of the previous components; it is clear that the largest variation in $\delta\mu$ is caused by
$\delta M$ due to the wide CC luminosity functions. It is also notable that each CC type has distinct values for each
of the residual components, implying that the contaminating distance distribution will depend on which types of
CC SNe are in the final sample used to measure cosmology.



3.5. Delta and Extinction Correlation: $\delta\mu = \delta m - \delta K - \delta A - \delta\Delta$

When both $A_V$ and $\Delta$ are allowed to vary in the distance fit, the best-fit values of the extinction and
$\Delta$ become correlated, and this correlation varies with both redshift and CC type. Compared to the previous
$\delta A$ distributions (Section 3.3), the Ib/c and IIP extinction residual distributions are largely unchanged when
$\Delta$ is allowed to vary, but the IIL extinction residuals shift toward $\delta A = 0$ (i.e., MLCS2k2 overestimates
the extinction by a smaller amount). As seen in Figure 3, the SNe Ib/c and IIL show a trend of decreasing $\Delta$ with
increasing $A_V$, while SNe IIP have very low values of the output $\Delta$ and a mean output $A_V$ that decreases
with increasing $z$. SNe IIL show a drastic evolution in the trend with $z$: at low $z$, the range of output $\Delta$ is
relatively narrow and positive, and the range of output $A_V$ is wide; at high $z$, the range of output $\Delta$ becomes
wide and extends to negative values, while the range of output $A_V$ becomes narrow and low.
Fig. 3. - The output $\Delta$ vs. output $A_V$ when both are allowed to vary in the distance fit
($\delta\mu = \delta m - \delta K - \delta A - \delta\Delta$), for $z = 0.075$ (top) and $z = 0.325$ (bottom), showing
the correlation between $\Delta$ and $A_V$ and its evolution with redshift. SNe Ib/c are shown in blue, SNe IIL in
violet, and SNe IIP in red. The input $A_V$ values are drawn from an exponential distribution with an average of
0.35, and the "true" $\Delta$ values are 0 for CC SNe.



The trend in the output $\Delta$ vs. $A_V$ and its evolution with redshift, most apparent for SNe IIL, can be
understood to be a result of the dependency of SN Ia max-light colors on the $\Delta$ parameter. As the best-fit
$\Delta$ changes, the assumed intrinsic color changes, which will change the value of the measured color
excess for a given measured color; if the assumed intrinsic color is bluer, the best-fit extinction increases, and
vice-versa. Figure 8 of Jha et al. (2007) shows the max-light $U - B$, $B - V$, and $V - I$ colors as a function
of $\Delta$: as $\Delta$ increases, the $U - B$ and $B - V$ colors increase rapidly while $V - I$ is much flatter. Thus
if the light curve is fit to have a high value of $\Delta$, the redder $U - B$ and $B - V$ SN Ia colors mean the color
difference between CC and Ia SNe is less and the best-fit extinction is less, which matches the observed trend
in Figure 3.

This relationship between $\Delta$ and $A_V$ remains when the CC SNe are simulated with an intrinsic color
variation, where the magnitudes in one filter are given a random shift with respect to the magnitudes in
another. When such a dispersion is added to the CC light curves, there is about 0.1 mag of dispersion added
to the distribution of distance residuals.



3.6. All Residuals: $\delta\mu = \delta m - \delta K - \delta A - \delta\Delta - \delta M$

We now allow all components of the distance residuals to vary. In addition, in this simulation we give
each SN a random intrinsic color variation, such that the dispersion in color is 0.15 mag (Sullivan et al.
2006), and we also add variation in the light curve shapes by introducing the CSP templates in addition to
the Nugent templates, such that each individual supernova has an equal chance of using either template.
This is an attempt to characterize a wider range of output values than is possible with one set of templates,
since a real supernova survey will observe the full gamut of SN light curve shapes, colors, etc.

The results of the previous sections are largely unchanged: each SN type has distinct values of the
various residual components, the output $\Delta$ and $A_V$ show the same correlations, and the $\delta M$ distribution
continues to dominate the distribution of $\delta\mu$ while the other components cause $\delta\mu$ to shift with
respect to $\delta M$. Additionally, some of the components show a bi-modality, where the CSP SNe are separated from
the Nugent SNe by as much as a few tenths of a magnitude. This suggests that the SN light curve shapes
and colors have a strong impact on the best-fitting parameters in the distance fit and is a warning that the
Nugent templates alone (plus Gaussian random colors) do not represent the full range of observable SNe;
however, the small differences in some of the component residuals are washed out in the $\delta\mu$ distribution due
to the large variation in absolute magnitudes.

To mimic the observational effects of a real supernova survey, we require that the supernovae have at 
least one epoch in each filter with signal-to-noise S/N > 5.0. This removes simulated supernovae that are 
not likely to be observed because they are in the noise, and additionally applies a minimal quality cut to the 
survey. Such quality cuts attempt to maintain a high precision in the final cosmological analysis. 
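The selection criterion just described is simple enough to state as code. The sketch below is a hypothetical helper, not taken from the analysis software used here; it assumes each simulated supernova is represented as a mapping from filter name to the list of per-epoch signal-to-noise values.

```python
def passes_sn_cut(snr_by_filter, threshold=5.0):
    """True if every filter has at least one epoch with S/N above threshold.

    snr_by_filter: dict mapping a filter name (e.g. "g", "r", "i") to a list
    of per-epoch signal-to-noise ratios. An object with no filters fails.
    """
    if not snr_by_filter:
        return False
    return all(
        any(snr > threshold for snr in epochs)
        for epochs in snr_by_filter.values()
    )
```

Because the requirement applies to each filter separately, a supernova that is bright in two bands but never exceeds the threshold in the third is removed from the sample.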

When S/N cuts are applied to SNe Ib/c, IIL, and IIP, the $\delta\mu$ distributions of all three SN types show
a drastic evolution with redshift. The output $A_V$ of the SNe that survive at high redshift continue to be
rather high while the input $A_V$ decreases. The mean absolute magnitude gets brighter and the mean distance
decreases (gets closer) with redshift, for all SN types, as more and more intrinsically faint SNe fail to pass
the S/N cut. This selection effect also causes the standard deviation of the distance and magnitude residuals
to decrease with redshift. Figure 4 shows the mean, standard deviation, and skewness of the distance and
magnitude residual distributions as a function of redshift, for all 3 CC types. We discuss each type in turn
below.



The mean of the SN Ib/c (shown in blue) distance and magnitude residuals decrease with about the 
same slope (the other components produce an offset between them) , and their standard deviations are 
also similar, with the distance distribution becoming slightly more spread relative to the magnitude 
distribution at higher redshifts. The skewness of the distance residual distribution is slightly negative, 
and its evolution with redshift also follows that of the absolute magnitude distribution. This suggests 
that for SNe Ib/c, the distribution of contaminating distances can be parameterized by the absolute 
magnitude luminosity function (for the case when this function is Gaussian with a large width) plus 
the effects of the survey magnitude limit. 

The mean of the SN IIL (shown in violet) absolute magnitude residuals decreases a bit more steeply 
than that of the distance residuals, due to the changing extinction residuals. Note however that in the 
highest redshift bins only a few out of the initial 500 SNe IIL have passed the S/N cut, as they are less 
likely to have very bright objects than are SNe Ib/c given the simulated luminosity functions. Because 
the width of the SN IIL extinction residual distribution is larger than that of the SN Ib/c, and
the width of Richardson's SN IIL luminosity function is smaller than that of the SN Ib/c luminosity 
function, the distribution of contaminating distances and the absolute magnitude luminosity function 
(after accounting for selection effects) are not as similar for SNe IIL as they are for SNe Ib/c. 

The means of the SN IIP (shown in red) distance and absolute magnitude residuals decrease with 
about the same slope and their standard deviation and skewness are very similar until z ~ 0.275, at 
which point the number of surviving SNe is small and statistics become very uncertain. Thus as long 
as SNe IIP make it into the sample, it appears that the distribution of contaminating distances may be
parameterized by the absolute magnitude luminosity function, if it is as wide as in Richardson et al.
(2002).
[Figure 4 panel titles: Mean of δμ and -δM; Standard Deviation of δμ and -δM]

Fig. 4. The mean, standard deviation, and skewness of the distance (δμ; solid) and absolute magnitude
(-δM; dashed) residual distributions as a function of redshift for all 3 CC types, as well as the number of
SNe per redshift that pass the signal-to-noise cut. SNe Ib/c are shown in blue, SNe IIL in violet, and SNe
IIP in red. Note that no SNe IIP pass the cut at the highest redshift, so there are no SNe IIP statistics at
this redshift. The distance residual distribution is a strong function of the magnitude distribution for SNe
Ib/c and IIP, but they seem to be less correlated for SNe IIL.
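The selection effect described above can be illustrated with a toy Monte Carlo. The sketch below is not the
simulation used in this work: it assumes a Gaussian luminosity function with a hypothetical mean of -17 mag
and a width of 1.4 mag, replaces the S/N cut with a single hypothetical peak-magnitude limit, and uses a crude
low-redshift approximation for the distance modulus. It only shows how a magnitude limit makes the surviving
sample brighter and narrower as redshift increases.

```python
import math
import random
import statistics

C_KM_S = 299792.458  # speed of light in km/s


def distance_modulus(z, h0=70.0):
    """Crude low-z approximation (assumption): d_L ~ (c z / H0)(1 + z/2), in Mpc."""
    d_l_mpc = C_KM_S * z / h0 * (1.0 + 0.5 * z)
    return 5.0 * math.log10(d_l_mpc) + 25.0


def skewness(values):
    """Population skewness: mean of the cubed standardized deviations."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def surviving_moments(z, mean_abs_mag=-17.0, sigma=1.4, mag_limit=22.5,
                      n=500, seed=0):
    """Draw absolute magnitudes, keep those brighter than the (hypothetical)
    apparent-magnitude limit, and return (count, mean, std, skewness)."""
    rng = random.Random(seed)
    mu = distance_modulus(z)
    drawn = [rng.gauss(mean_abs_mag, sigma) for _ in range(n)]
    kept = [m for m in drawn if m + mu < mag_limit]
    if len(kept) < 2:
        return len(kept), None, None, None
    return (len(kept), statistics.fmean(kept),
            statistics.pstdev(kept), skewness(kept))


if __name__ == "__main__":
    # Redshift grid assumed for illustration only.
    for z in (0.075, 0.175, 0.275, 0.375):
        print(z, surviving_moments(z))
```

Because the same seed is reused at every redshift, the high-redshift survivors are a subset of the
low-redshift ones, which makes the trend easy to read off.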

It is also general practice to remove SNe from the sample whose light curves have fitted values that lie
outside the range used to train the distance fitter. Thus we also look at the effects of removing all SNe with
an output Δ < -0.4 and at the same time applying the above S/N cut. This mainly affects the SNe IIP,
as the SNe Ib/c and SNe IIL have only a few SNe with an output Δ below this cutoff at all redshifts. As
mentioned previously, SNe IIP are very broad, which gives them very negative values of output Δ. For the
lowest redshift, z = 0.075, about 80% of the SNe IIP are removed; this increases to 96% by z = 0.225. Below
this redshift, the previous result remains the same, with the mean, standard deviation, and skewness of the
distance residuals following that of the absolute magnitude residual distributions. This suggests, however,
that due to their very broad light curves, combined with their faint absolute magnitudes, SNe IIP may not
be a significant contributor to the contamination of the SN Ia cosmology analysis. Thus the contaminating
distance distribution will depend mostly on the SN Ib/c and SN IIL luminosity functions and the relative
rates of these CC SN types, combined with the effects of the survey magnitude limit and candidate selection.
4. COMPARISON TO SDSS DATA



We now address the question of whether the features of the observed SDSS SNe match those of the 
simulated SNe and can be explained by the main results presented above. 

In addition to the CC SNe presented in Section 3.6, we simulate SNe Ia as described in Section 2, varying
the Δ, A_V, absolute magnitude, and colors and applying the same selection cuts. To determine how many
of each type of SN we would expect to observe, we calculate the number of SN explosions per day per deg^2
as a function of redshift from the intrinsic SN rates per year per comoving volume element: for CC SNe,
we use the rates from Dahlen et al. (2004) and split them into the CC subtypes using the fractions given in
Table 1, and for SNe Ia, we use the rates from Dahlen et al. (2008). We then multiply this by the fraction,
as a function of redshift and type, of SNe that we expect to pass the S/N requirements based on the results
of our simulations (see the bottom right panel of Figure 4; note that the CC component is dominated
by type Ib/c above z = 0.25). The number of supernovae per redshift for all SN types is normalized in such
a way as to produce the best match between data and simulations after all cuts are applied.
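The conversion from a volumetric rate to an observed number can be sketched as follows. This is a minimal
illustration, not the calculation used here: it assumes a flat ΛCDM cosmology with hypothetical values
H_0 = 70 km/s/Mpc and Ω_m = 0.3, and the rate, subtype fraction, and passing fraction are placeholder
arguments rather than the Dahlen et al. rates or the simulated efficiencies. The factor 1/(1 + z) converts
the rest-frame rate to the observer frame.

```python
import math

C_KM_S = 299792.458                   # speed of light in km/s
DEG2_TO_SR = (math.pi / 180.0) ** 2   # steradians per square degree


def e_of_z(z, omega_m=0.3):
    """Dimensionless Hubble rate H(z)/H0 for flat LCDM (assumed cosmology)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Trapezoid-rule integral of c / H(z') from 0 to z, in Mpc."""
    dz = z / steps
    total = 0.5 * (1.0 / e_of_z(0.0, omega_m) + 1.0 / e_of_z(z, omega_m))
    total += sum(1.0 / e_of_z(i * dz, omega_m) for i in range(1, steps))
    return C_KM_S / h0 * total * dz


def explosions_per_day_per_deg2(z, dz_bin, rate_per_yr_mpc3,
                                h0=70.0, omega_m=0.3):
    """Observed explosions per day per deg^2 in a redshift bin of width dz_bin."""
    d_c = comoving_distance_mpc(z, h0, omega_m)
    # Comoving volume per steradian per unit redshift, in Mpc^3.
    dv_dz_per_sr = C_KM_S / (h0 * e_of_z(z, omega_m)) * d_c ** 2
    volume = dv_dz_per_sr * DEG2_TO_SR * dz_bin
    # Time dilation lowers the observed rate by (1 + z).
    return rate_per_yr_mpc3 / (1.0 + z) * volume / 365.25


def expected_detections(z, dz_bin, rate_per_yr_mpc3,
                        subtype_fraction, pass_fraction):
    """Scale the explosion rate by the subtype and selection fractions."""
    return (explosions_per_day_per_deg2(z, dz_bin, rate_per_yr_mpc3)
            * subtype_fraction * pass_fraction)
```

Since the overall normalization is fit to the data afterwards, only the relative numbers across types and
redshifts matter in such a calculation.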

To compare the simulations to the SDSS data, we use the first-year sample of identified SNe Ia and CC
SNe, as well as unidentified (spectroscopically) SNe, with the requirement that the SN has a spectroscopic
redshift from the host galaxy or the supernova itself to avoid any biases caused by photometric redshift
errors (Hlozek et al. 2010; Smith 2009). These SNe satisfy the following conditions: observations on at least
5 different epochs; at least one epoch with S/N > 5 in each of the g, r, and i bands; at least one observation
at least 2 days before peak light, in the rest-frame; and at least one observation at least 10 days after peak
light (Dilday et al. 2008). (Note that the simulated SNe are required to satisfy the same conditions.) Adding
the requirement that Δ > -0.4 removes 40-50% of the SDSS SNe in the lowest two redshift bins; since it
is our goal to compare the core-collapse distributions of the SDSS data to the simulations, we do not apply
this cut to either the SDSS data or the simulations.
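The selection conditions listed above can be written as a single predicate. The sketch below is illustrative
only: the `Observation` record and the function name are hypothetical, distinct epochs are assumed to mean
distinct integer MJD nights, and the 10-day post-peak condition is assumed to be evaluated in the rest frame
like the 2-day pre-peak condition.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    mjd: float   # observation date (Modified Julian Date)
    band: str    # filter name, e.g. "g", "r", "i"
    snr: float   # signal-to-noise ratio of the flux measurement


def passes_selection(observations, peak_mjd, redshift, min_epochs=5,
                     min_snr=5.0, bands=("g", "r", "i"),
                     days_before=2.0, days_after=10.0):
    """Return True if a light curve satisfies all four selection conditions."""
    # 1. Observations on at least `min_epochs` different nights (assumption:
    #    one epoch per integer MJD).
    epochs = {math.floor(obs.mjd) for obs in observations}
    if len(epochs) < min_epochs:
        return False
    # 2. At least one measurement with S/N above threshold in every band.
    for band in bands:
        if not any(o.band == band and o.snr > min_snr for o in observations):
            return False
    # 3 and 4. Rest-frame phase coverage before and after peak light.
    phases = [(o.mjd - peak_mjd) / (1.0 + redshift) for o in observations]
    has_early = any(p <= -days_before for p in phases)
    has_late = any(p >= days_after for p in phases)
    return has_early and has_late
```

Applying the same predicate to both the data and the simulated light curves keeps the two samples
comparable.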

After adjusting the SDSS distances to have the same value of H_0 that is used in the simulations, we
calculate the distance residual of each SN by subtracting the distance modulus at the SN redshift (calculated
using the same fiducial cosmological parameters as the simulations) from this adjusted distance. Since the
true cosmological parameters may not match those used in the simulations, there may be some small shift
in residual space between the SDSS and simulated distance residuals. (For example, a 15% change in Ω_m
or a 10% change in w_0 would shift the SDSS distance residuals by ~ 0.005 mag at z = 0.05 and by ~ 0.03
mag at z = 0.45, and a 10% change in H_0 would shift them by ~ 0.22 mag at all redshifts.) We combine
the SDSS SNe into 8 equal-width redshift bins, using the discrete redshifts of the simulated SNe as the bin
centers.
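The H_0 adjustment and the binning can be sketched in a few lines. Because the luminosity distance scales as
1/H_0, rescaling the Hubble constant shifts every distance modulus by 5 log10(H_0,old / H_0,new), independent
of redshift; a 10% change gives roughly 0.21 to 0.23 mag depending on its direction, in line with the value
quoted above. The function names are hypothetical, and the bin edges (0.05 to 0.45 in steps of 0.05, i.e.
centers 0.075, 0.125, ..., 0.425) are an assumption inferred from the redshifts quoted in the text.

```python
import math


def h0_shift_mag(h0_old, h0_new):
    """Magnitude shift to add to distance moduli when moving from h0_old to h0_new."""
    return 5.0 * math.log10(h0_old / h0_new)


def distance_residual(mu_measured, mu_fiducial, h0_measured, h0_fiducial):
    """Residual of an H0-adjusted distance modulus relative to the fiducial model."""
    mu_adjusted = mu_measured + h0_shift_mag(h0_measured, h0_fiducial)
    return mu_adjusted - mu_fiducial


def redshift_bin(z, z_min=0.05, width=0.05, n_bins=8):
    """Index of the equal-width bin containing z, or None if z is out of range."""
    index = math.floor((z - z_min) / width)
    return index if 0 <= index < n_bins else None


if __name__ == "__main__":
    print(h0_shift_mag(70.0, 77.0))  # H0 raised by 10%
    print(h0_shift_mag(70.0, 63.0))  # H0 lowered by 10%
```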

The simulated and SDSS distance residuals are in general agreement, but the simulated SNe contain
outliers of very close CC SNe, with δμ < 0, that do not appear in the SDSS data. After investigating several
possibilities, discussed below, we determine that it is most likely that the absolute magnitude distributions
used to simulate the CC SNe are too bright by ~ 1 mag, and we re-run the simulation with fainter CC SNe
to find better agreement (see Figure 5).

(a) CC vs. Ia rate too high: The excess of bright CC SNe causing large negative distance residuals
may be a result of including too many simulated CC SNe with respect to SNe Ia; this would mean that while
these bright SNe occur, they are rarer than assumed for the simulated data and so are not showing up in
the SDSS data. The ratio of the core collapse to Ia rate used to populate the simulation distance residuals
is R_CC/R_Ia = 3.6 at z = 0.3. Botticella et al. (2008) determine R_CC = 1.15 at z = 0.21 and R_Ia = 0.34
at z = 0.3; using their Figure 10 to extrapolate the CC rate to z = 0.3, we estimate that R_CC/R_Ia ~ 4 at
z = 0.3. Bazin et al. (2009) find an even larger value of R_CC/R_Ia = 4.5 at z ~ 0.3. Thus the relative CC
to Ia rate used in the simulations is less than other values in the literature and cannot cause the excess of
bright simulated CC SNe.

(b) Fraction of Ib/c too large: If the relative number of simulated SNe Ib/c with respect to
SNe II is too high, that would contribute to there being an excess of bright CC SNe that is not observed
in the SDSS data, since the SNe Ib/c are more likely to be very bright than are SNe IIL or IIP.
The fraction of SNe Ib/c with respect to SNe II we use is 24% (see Table 1), while other studies give
29% (Smartt et al. 2009), 26.5% (Li et al. 2007), 24.7% (van den Bergh et al. 2005), 24.6% (Prieto et al.
2008), 22.3% (Cappellaro et al. 1999), and 24% (Bazin et al. 2009). Though there is quite a bit of
uncertainty in the relative fractions of the CC subtypes, the Ib/c fraction we use is consistent with the
literature.

[Figure caption fragment; beginning missing] ... using the absolute magnitudes of Richardson et al. ...
Ib/c luminosity function of Richardson et al. (2006), as well as fainter CC absolute magnitudes, are in blue.
There is good agreement between the SDSS residuals and the fainter simulation, while the brighter simulation
contains large negative outliers at z > 0.275.

(c) CC SNe too red: If the simulated CC SNe are redder than the SDSS CC SNe, this would
contribute to the discrepancy between the simulated and SDSS distance residuals, since this extra reddening
would be interpreted as an effect of dust, thereby increasing δA_V and decreasing δμ. To investigate this
possibility we look at the difference between the rest-frame max-light colors of our simulated CC SNe and
those of the template SNe Ia used by the distance fitter, and we compare these color differences to those
in Poznanski et al. (2002) (acknowledging that the available data on CC colors is limited). We conclude that
the Nugent CC and CSP Ib/c templates provide reasonable representations of CC colors, while the CSP IIL
template is quite red, especially in the bluer bands, and the CSP IIP template is only slightly redder than
the colors given in Poznanski et al. (2002). Since the simulation outliers are composed equally of CSP and
Nugent templates and dominated by SNe Ib/c, they cannot be caused by overly red CC SNe.

(d) SDSS SN candidate selection: We must also remember that the SDSS-II SN survey follow-up
strategy was designed to target SNe Ia preferentially (Sako et al. 2008; Kessler et al. 2009a), so it is
possible that the SDSS candidate selection process excludes (bright) CC SNe which remain in the final
simulated sample. In order for a SN candidate to make it into the SDSS sample we are considering, it
must have a spectroscopic redshift from either the host galaxy or the SN itself. SN Ia candidates (SNe that
are likely to be Ia) are identified by matching the available photometry to templates, and these are given
the highest priority for spectroscopic follow-up (Sako et al. 2008). Host-galaxy spectroscopic redshifts were
either obtained from the SDSS-I galaxy redshift survey or further follow-up observations of the host galaxies
of ~ 80 high quality Ia-like candidates (Sako et al. 2008; Dilday et al. 2008), which again selects against CC
SNe having spectroscopic redshifts.

Though we cannot simulate the spectroscopic follow-up decisions exactly, we point out that
almost all bright transients with r < 20.5 mag were targeted for spectroscopic follow-up observations
(Sako et al. 2008), so any CC SNe that are this bright would have appeared in the SDSS data. We find
13 SNe Ib/c, 11 SNe IIL, and 4 SNe IIP that meet this criterion in the simulated sample (and 10, 10, and 3
respectively after applying a 2σ truncation to the absolute magnitudes), and these have the largest negative
distance residuals; this is a clue that the simulated CC magnitudes are too bright.



(e) CC absolute magnitudes too bright: The CC luminosity functions we used are from Richardson et al.
(2002) (hereafter R02; see Table 1); due to the limited number of SNe used in that paper to build the
absolute magnitude distributions, there is considerable uncertainty in both the mean and width of the CC LFs,
and indeed whether they can/should be fit with Gaussian functions. The simulated distances would come
out closer than the SDSS data both if the mean of the distributions were too bright and if the width were
too large. As to the mean, R02 acknowledge that there may be a luminosity bias in their CC distributions,
since fainter SNe, if they exist, are less likely to be observed than the brighter SNe. Their SNe Ia luminosity
function is most likely complete, both because SNe Ia are brighter and because R02 find many faint outliers.
As to the width of the absolute magnitude distributions, Richardson et al. (2006) (hereafter R06) look at the
absolute magnitudes of stripped-envelope, or Type Ib/c, SNe and find the same mean absolute magnitude
but a much reduced width of 0.9 mag, as compared to 1.4 mag in R02. Again, as in R02, R06 note that
there may be a luminosity bias in their sample for which they do not attempt to correct.

We test whether this discrepancy is the result of unexpected absolute magnitudes in the randomly
generated distribution by removing simulated SNe with absolute magnitudes greater than 2σ from the
mean, where σ is the R02 width for each CC type. We find that the furthest outliers are removed from the
simulated sample, but outliers remain that are 1-2 mag less than the SDSS residuals.
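A 2σ truncation of a Gaussian luminosity function can be sketched as below. This is an illustration under
stated assumptions, not the procedure used in the simulations: the function name and the mean of -17 mag in
the usage lines are hypothetical, and the cut is assumed to be symmetric about the mean. Note that with
σ = 1.4 mag the truncation still admits SNe up to 2.8 mag brighter than the mean, so it is not surprising
that some bright outliers survive it.

```python
import random


def draw_truncated(mean, sigma, n, n_sigma=2.0, seed=0):
    """Draw n Gaussian absolute magnitudes and drop those farther than
    n_sigma * sigma from the mean (symmetric cut, assumed)."""
    rng = random.Random(seed)
    draws = [rng.gauss(mean, sigma) for _ in range(n)]
    return [m for m in draws if abs(m - mean) <= n_sigma * sigma]


if __name__ == "__main__":
    kept = draw_truncated(mean=-17.0, sigma=1.4, n=500)
    print(len(kept), min(kept), max(kept))
```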

To estimate the amount by which the simulated absolute magnitudes may be too bright, we look at
a recent sample of CC and Ia SNe from SNLS (Bazin et al. 2009). In particular, they define a "pseudo-absolute
magnitude" ΔM_570, which is proportional to the absolute magnitude according to the supernova's
distance but ignoring the effect of dust absorption. Figure 11 of Bazin et al. (2009) shows the distribution
of ΔM_570 both before and after correcting for the detection efficiency and volume of the SNLS survey; after
this correction, the CC component seems to be 2-2.5 mag fainter than the Ia component, and there is an
increase in the number of events with ΔM_570 > 4 mag fainter than the SNe Ia that may account for the
fainter SNe IIP. In R02, however, the difference between the mean absolute magnitudes of SNe Ib/c/IIL and
SNe Ia is only 1.4 mag. From this we conclude that the simulated CC SNe could be ~ 1 mag too bright;
however, since there remain simulated CC distance residuals which are greater than 1 mag removed from
the SDSS sample, it is also possible that the Ib/c width of 1.4 mag should be reduced to the R06 width of
0.9 mag.

To determine if a sufficient match can be made between SDSS data and simulations with fainter CC SNe,
we repeat the simulation after adding 1 mag to the mean of the R02 absolute magnitudes and decreasing the
width of the Ib/c luminosity function to have the R06 value. As expected, the outliers caused by very bright
CC SNe no longer appear in the simulated distance distributions; indeed, the shapes of the simulated and
SDSS residual distributions are very similar (see Figure 5). No CC SNe of any type pass the S/N cut for
z > 0.375, which is consistent with the SDSS data: of the 292 SNe in the SDSS first-year BEAMS sample,
only 57 are not categorized as either confirmed Ia (from spectra) or likely Ia (from photometry or low S/N
spectra) (Sako et al. 2008; Smith 2009), and all of these 57 have z < 0.35. Though this does not mean that
we can say whether the fainter luminosity functions are the true CC luminosity functions, the agreement
between data and simulations is encouraging, and it suggests that we may be able to use the simulations to
parameterize the contaminating distance distribution for Bayesian supernova cosmology.



5. CONCLUSIONS 

Through the use of detailed simulations of the photometric observations of CC SNe, we have attempted 
to characterize and account for the features of the distribution of their distances as determined by the Type
Ia MLCS2k2 distance fitter^4 in order to understand how CC SNe contaminate a sample of SNe Ia distances.
We have found that the contaminating distance distribution for a given CC type can be characterized by its 
luminosity function plus a shift in magnitude due to the other components of the distance fit, and its evolution 
with redshift is determined by the selection effects of a magnitude-limited survey. The full distribution of 
contaminating distances will depend on the relative amounts of the CC types, and its evolution with redshift 
will be affected by the redshift-dependence of the intrinsic CC rates. Finally, the details of the shape of 
the contaminating distance distribution will depend on the various selection cuts and qualifiers that go 
into building the SN sample used to calculate the cosmological parameters. Thus to better inform the 
contaminating likelihood function we need to know: 

1. the SN Ib/c, IIL, and IIP absolute magnitude luminosity functions; 

2. the relative rates of the CC types; and 

3. the selection efficiencies as a function of redshift, SN type, etc. of the survey. 

It is not necessary that all of these quantities be known exactly, since the contaminating distribution may 
be parameterized in the Bayesian framework and the parameters marginalized over, but the choice of pa- 
rameterization will act as a strong prior on the final results. It is also possible to solve for the parameters 
describing the contaminating distribution instead of the cosmology, given a large sample of SN distances, 
though again the results will depend strongly on the parameterization scheme. 
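To make the role of the parameterization concrete, the sketch below writes down a two-population mixture
likelihood of the kind used in Bayesian supernova cosmology. It is an illustration under strong assumptions,
not the likelihood used in any particular analysis: the contaminating distribution is taken to be a single
Gaussian described by a hypothetical shift `cc_shift` and width `cc_width`, whereas the results above suggest
that the real distribution is shaped by the luminosity functions and the survey selection. All names are
hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Normal probability density with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(residuals, sigmas, p_ia, cc_shift, cc_width):
    """Mixture log-likelihood of distance residuals (observed minus model).

    residuals : distance-modulus residuals for a trial cosmology
    sigmas    : measurement uncertainties on the distance moduli
    p_ia      : per-object probabilities of being a Type Ia
    cc_shift, cc_width : parameters of the assumed Gaussian contaminant
    """
    total = 0.0
    for r, s, p in zip(residuals, sigmas, p_ia):
        like_ia = gaussian_pdf(r, 0.0, s ** 2)
        like_cc = gaussian_pdf(r, cc_shift, s ** 2 + cc_width ** 2)
        total += math.log(p * like_ia + (1.0 - p) * like_cc)
    return total
```

In practice `cc_shift` and `cc_width` would be given priors and marginalized over along with the
cosmological parameters; a different functional form for the contaminant changes the answer, which is the
sense in which the parameterization acts as a prior.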

One avenue of future work is to investigate whether the contaminating distribution can be characterized
in a non-parametric way by the simulated δμ distributions themselves. Parameters such as a global shift in
magnitude may be added and marginalized over, but the simulations would provide the width and shape
of the contaminating distribution. This would be similar to modeling the survey selection efficiencies in
spectroscopic SN Ia samples (see, for example, Kessler et al. (2009a)). We would need to have confidence
that the simulations accurately represent the probability distribution that the data will be drawn from (as
a function of distance and redshift), or that the distribution may be varied by a free parameter which is
marginalized over without this introducing any biases.

Finally, we would like to stress that these results depend on having accurate redshifts. Photometric
surveys that rely on photometric redshifts instead of host-galaxy spectroscopic redshifts will need to worry
about how the redshift errors will affect the contaminating distance distribution, as well as the distances
of the SNe Ia and the derived cosmological parameters. One recent paper (Gong et al. 2010) addresses
this question by fitting simulated Ia and CC SNe to a set of Ia templates to determine simultaneously the
redshift and the distance, and they use the BEAMS method to determine the recovered cosmology. We note,
however, that they do not take into account the full CC luminosity functions, which we have found to be the
dominant component of the contaminating distance distribution, nor do they simulate the selection effects
of a magnitude-limited survey, which we have found to affect the evolution of the CC distance distribution
with redshift. Thus a more complicated contaminating distribution function will be needed.

4 Note that the main results would be the same but the specifics of how the CC distance function depends on the various
component parts (apart from δM) would change if a distance-fitter other than MLCS2k2 is used.

We hope that by highlighting the dominant features affecting the contaminating distribution function, 
this work will improve both the analyses of future supernova cosmology surveys and the simulations of their 
effectiveness. We recommend that future surveys strive to obtain representative spectroscopic follow-up, 
targeting the full gamut of observed SNe, in order to better understand the contaminating population that 
will affect any photometric supernova cosmology measurement. 